Table of Contents

  • 0. Preparation
    • 0.1 Install / Load Packages
  • 3. Grover's Algorithm
    • 3.1 Introduction
      • 3.1.1 Unstructured Search
      • 3.1.2 Creating an Oracle
      • 3.1.3 Amplitude Amplification
    • 3.2 Example: 2 Qubits
      • 3.2.1 2 Qubits Grover's algorithm circuit
      • 3.2.2 Qiskit Implementation
      • 3.2.3 Experiment with Simulators
    • 3.3 Example: 3 Qubits
      • 3.3.1 3 Qubits Grover's algorithm circuit
      • 3.3.2 Qiskit Implementation
      • 3.3.3 Experiment with Simulators
  • 4. Quantum Algorithms for Application
    • 4.1 Regression
      • 4.1.1 Regression with an EstimatorQNN
      • 4.1.2 Regression with the Variational Quantum Regressor (VQR)
  • 5. Xanadu - PennyLane
    • 5.1 PennyLane Basics
      • 5.1.1 Building Quantum circuits
    • 4.2 Variational classifier using PennyLane
      • 4.2.1 Fitting the parity function
        • 4.2.1.1 Quantum and classical nodes
        • 4.2.1.2 Cost
        • 4.2.1.3 Optimization
      • 4.2.2 Dataset: Iris and its classification
        • 4.2.2.1 Quantum and classical nodes
        • 4.2.2.2 Data: Iris (Iris flower data)
        • 4.2.2.3 Data preparation to load to quantum circuit
        • 4.2.2.4 Optimization
        • 4.2.2.5 Result
    • 4.3 Quantum Convolutional Neural Network (QCNN) using PennyLane
      • 4.3.1 Introduction: Classical convolution vs. Quantum convolution
      • 4.3.2 What is happening to qubits when we are optimizing a quantum circuit?
      • 4.3.3 Loading packages / data:
      • 4.3.4 Setting of the main hyper-parameter of the model
      • 4.3.5 Loading MNIST dataset from tensorflow-keras
      • 4.3.6 Quantum circuit as a convolution kernel
      • 4.3.7 Quantum pre-processing of the dataset
      • 4.3.8 Plotting the effect of QCNN on the data
      • 4.3.9 Hybrid quantum-classical model
      • 4.3.10 Training the model
      • 4.3.11 Prediction and model evaluation
    • 4.4 QCNN in development / research level
      • 4.4.1 QCNN models scheme
      • 4.4.2 Data Encoding: classical data into quantum data
      • 4.4.3 Convolutional filter and Pooling layer
      • 4.4.4 Results of QCNN model

0. Preparation¶

0.1 Install / Load Packages¶

  • We need to install or load packages before running the code below.
In [1]:
# Install packages

# !pip install numpy
# !pip install matplotlib
# !pip install qiskit
# !pip install qiskit[visualization]
# !pip install qiskit[optimization]
# !pip install qiskit[machine-learning]
In [2]:
# Load packages
## General tools
import numpy as np
import matplotlib.pyplot as plt
from math import pi

## Qiskit Circuit Functions
from qiskit import *
from qiskit.quantum_info import *
from qiskit.visualization import *

3. Grover's Algorithm¶

  • We will be constructing Grover's Algorithm (also known as Grover's search algorithm).

  • Link: Qiskit - Grover's Algorithm

3.1 Introduction¶

  • You have likely heard that one of the many advantages a quantum computer has over a classical computer is its superior speed in searching databases. Grover's algorithm demonstrates this capability.
  • This algorithm can speed up an unstructured search problem quadratically, but its uses extend beyond that; it can serve as a general trick or subroutine to obtain quadratic run time improvements for a variety of other algorithms.
  • This is called the amplitude amplification trick.

3.1.1 Unstructured Search¶

  • Suppose you are given a large list of $N=2^n$ items. Among these items there is one item with a unique property that we wish to locate; we will call this one the winner $w$. Think of each item in the list as a box of a particular color. Say all items in the list are gray except the winner $w$, which is purple.

grover_list.png

  • To find the purple box (the marked item) using classical computation, one would have to check on average $N/2$ of these boxes, and in the worst case, all $N$ of them. On a quantum computer, however, we can find the marked item in roughly $\sqrt{N}$ steps with Grover's amplitude amplification trick.
  • A quadratic speedup is indeed a substantial time-saver for finding marked items in long lists. Additionally, the algorithm does not use the list's internal structure, which makes it generic; this is why it immediately provides a quadratic quantum speed-up for many classical problems.
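To get a feel for the size of this speedup, here is a quick back-of-the-envelope comparison for a list of $N = 2^{20}$ (about a million) items:

```python
import math

N = 2 ** 20                 # about a million items
classical = N // 2          # expected number of classical checks
quantum = math.isqrt(N)     # rough number of Grover steps, ~sqrt(N)
print(classical, quantum)   # 524288 vs 1024
```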

3.1.2 Creating an Oracle¶

  • For the examples in this textbook, our 'database' comprises all the possible computational basis states our qubits can be in. For example, if we have 3 qubits, our list is the states $|000\rangle, |001\rangle, \dots |111\rangle$ (i.e. the states $|0\rangle \rightarrow |7\rangle$).

  • An oracle can be thought of as an 'all-knowing device', but we have no access to its internal information or structure.

  • Grover’s algorithm solves oracles that add a negative phase to the solution states. I.e. for any state $|x\rangle$ in the computational basis:

$$ U_\omega|x\rangle = \begin{cases} \phantom{-}|x\rangle & \text{if} \; x \neq \omega \\ -|x\rangle & \text{if} \; x = \omega \end{cases} $$
  • This oracle will be a diagonal matrix, where the entry corresponding to the marked item has a negative phase. For example, if we have three qubits and $\omega = \text{101}$, our oracle will have the matrix:
$$ U_\omega = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ \end{bmatrix} \begin{aligned} \\ \\ \\ \\ \\ \\ \leftarrow \omega = \text{101}\\ \\ \\ \\ \end{aligned} $$
  • What makes Grover’s algorithm so powerful is how easy it is to convert a problem to an oracle of this form. There are many computational problems in which it’s difficult to find a solution, but relatively easy to verify a solution.
  • For these problems, we can create a function $f$ that takes a proposed solution $x$, and returns $f(x) = 0$ if $x$ is not a solution ($x \neq \omega$) and $f(x) = 1$ for a valid solution ($x = \omega$). Our oracle can then be described as:
$$ U_\omega|x\rangle = (-1)^{f(x)}|x\rangle $$
  • And the oracle's matrix will be a diagonal matrix of the form:
$$ U_\omega = \begin{bmatrix} (-1)^{f(0)} & 0 & \cdots & 0 \\ 0 & (-1)^{f(1)} & \cdots & 0 \\ \vdots & 0 & \ddots & \vdots \\ 0 & 0 & \cdots & (-1)^{f(2^n-1)} \\ \end{bmatrix} $$
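As a sanity check, a diagonal oracle of this form can be built directly with NumPy (a small illustration; `oracle_matrix` and the marking function are names introduced here, not part of any library):

```python
import numpy as np

def oracle_matrix(n, f):
    """Diagonal Grover oracle for n qubits: entry x gets the phase (-1)^f(x)."""
    return np.diag([(-1) ** f(x) for x in range(2 ** n)])

# Mark the winner w = 101 (index 5) on 3 qubits, as in the matrix above
U = oracle_matrix(3, lambda x: int(x == 0b101))
```

Row 5 carries the $-1$ phase; note that any oracle of this form is its own inverse, $U_\omega^2 = I$.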
  • Circuit Construction of a Grover Oracle: If we have our classical function $f(x)$, we can convert it to a reversible circuit of the form:

grover_classical_reversible_oracle.PNG

If we initialize the 'output' qubit in the state $|{-}\rangle$, the phase kickback effect turns this into a Grover oracle (similar to the workings of the Deutsch-Jozsa oracle):

grover_reversible_oracle.PNG

We then ignore the auxiliary ($|{-}\rangle$) qubit.

For the next part of this chapter, we aim to teach the core concepts of the algorithm. We will create example oracles where we know $\omega$ beforehand, and not worry ourselves with whether these oracles are useful or not.

3.1.3 Amplitude Amplification¶

  • So how does the algorithm work? Before looking at the list of items, we have no idea where the marked item is. Therefore, any guess of its location is as good as any other, which can be expressed in terms of a uniform superposition: $|s \rangle = \frac{1}{\sqrt{N}} \sum_{x = 0}^{N -1} | x\rangle.$

  • If at this point we were to measure in the standard basis $\{ | x \rangle \}$, this superposition would collapse, according to the fifth quantum law, to any one of the basis states with the same probability of $\frac{1}{N} = \frac{1}{2^n}$. Our chance of guessing the right value $w$ is therefore $1$ in $2^n$, as could be expected. Hence, on average we would need to try about $N/2 = 2^{n-1}$ times to guess the correct item.

  • Enter the procedure called amplitude amplification, which is how a quantum computer significantly enhances this probability. This procedure stretches out (amplifies) the amplitude of the marked item, which shrinks the other items' amplitudes, so that measuring the final state will return the right item with near-certainty.

  • This algorithm has a nice geometrical interpretation in terms of two reflections, which generate a rotation in a two-dimensional plane. The only two special states we need to consider are the winner $| w \rangle$ and the uniform superposition $| s \rangle$. These two vectors span a two-dimensional plane in the vector space $\mathbb{C}^N.$ They are not quite perpendicular because $| w \rangle$ occurs in the superposition with amplitude $N^{-1/2}$ as well. We can, however, introduce an additional state $|s'\rangle$ that is in the span of these two vectors, which is perpendicular to $| w \rangle$ and is obtained from $|s \rangle$ by removing $| w \rangle$ and rescaling.

  • Step 1: The amplitude amplification procedure starts out in the uniform superposition $| s \rangle$, which is easily constructed from $| s \rangle = H^{\otimes n} | 0 \rangle^{\otimes n}$.

grover_step1.jpg

The left graphic corresponds to the two-dimensional plane spanned by perpendicular vectors $|w\rangle$ and $|s'\rangle$, which allows us to express the initial state as $|s\rangle = \sin \theta | w \rangle + \cos \theta | s' \rangle,$ where $\theta = \arcsin \langle s | w \rangle = \arcsin \frac{1}{\sqrt{N}}$. The right graphic is a bar graph of the amplitudes of the state $| s \rangle$.

  • Step 2: We apply the oracle reflection $U_f$ to the state $|s\rangle$.

grover_step2.jpg

Geometrically this corresponds to a reflection of the state $|s\rangle$ about $|s'\rangle$. This transformation means that the amplitude in front of the $|w\rangle$ state becomes negative, which in turn means that the average amplitude (indicated by a dashed line) has been lowered.

  • Step 3: We now apply an additional reflection ($U_s$) about the state $|s\rangle$: $U_s = 2|s\rangle\langle s| - \mathbb{1}$. This transformation maps the state to $U_s U_f| s \rangle$ and completes one Grover iteration.

grover_step3.jpg

Two reflections always correspond to a rotation. The transformation $U_s U_f$ rotates the initial state $|s\rangle$ closer towards the winner $|w\rangle$. The action of the reflection $U_s$ in the amplitude bar diagram can be understood as a reflection about the average amplitude. Since the average amplitude has been lowered by the first reflection, this transformation boosts the negative amplitude of $|w\rangle$ to roughly three times its original value, while it decreases the other amplitudes. We then go to step 2 to repeat the application. This procedure will be repeated several times to zero in on the winner.

After $t$ steps we will be in the state $|\psi_t\rangle$ where: $| \psi_t \rangle = (U_s U_f)^t | s \rangle.$

How many times do we need to apply the rotation? It turns out that roughly $\sqrt{N}$ rotations suffice. This becomes clear when looking at the amplitudes of the state $| \psi \rangle$. We can see that the amplitude of $| w \rangle$ grows linearly with the number of applications $\sim t N^{-1/2}$. However, since we are dealing with amplitudes and not probabilities, the vector space's dimension enters as a square root. Therefore it is the amplitude, and not just the probability, which is being amplified in this procedure.

In the case that there are multiple solutions, $M$, it can be shown that roughly $\sqrt{N/M}$ rotations will suffice.
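The iteration count can be estimated from the condition $(2t+1)\theta \approx \pi/2$, generalizing $\theta$ to $\arcsin\sqrt{M/N}$ for $M$ solutions (a sketch; `grover_iterations` is a helper name introduced here):

```python
import math

def grover_iterations(N, M=1):
    """Approximate optimal number of Grover iterations for N items, M solutions."""
    theta = math.asin(math.sqrt(M / N))          # angle of |s> towards the solutions
    return max(1, round((math.pi / (2 * theta) - 1) / 2))

grover_iterations(2 ** 20)   # ~ (pi/4) * sqrt(N) iterations for one marked item
```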

grover_circuit_high_level_design.png

3.2 Example: 2 Qubits¶

3.2.1 2 Qubits Grover's algorithm circuit¶

  • Let's first have a look at the case of Grover's algorithm for $N=4$ which is realized with 2 qubits. In this particular case, only one rotation is required to rotate the initial state $|s\rangle$ to the winner $|w\rangle$:

    • 1) Following the above introduction, in the case $N=4$ we have $$\theta = \arcsin \frac{1}{2} = \frac{\pi}{6}.$$

    • 2) After $t$ steps, we have $$(U_s U_\omega)^t | s \rangle = \sin \theta_t | \omega \rangle + \cos \theta_t | s' \rangle ,$$where $$\theta_t = (2t+1)\theta.$$

    • 3) In order to obtain $| \omega \rangle$ we need $\theta_t = \frac{\pi}{2}$, which with $\theta=\frac{\pi}{6}$ inserted above results in $t=1$. This implies that after $t=1$ rotation the searched element is found.
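The three steps above can be checked numerically:

```python
import math

theta = math.asin(1 / 2)                         # theta = arcsin(1/sqrt(N)) = pi/6 for N = 4
t = 1
p_success = math.sin((2 * t + 1) * theta) ** 2   # probability of measuring |w> after t steps
# theta_t = 3 * pi/6 = pi/2, so sin(theta_t)^2 = 1: one rotation suffices
```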

We will now follow through an example using a specific oracle.

Oracle for $\lvert \omega \rangle = \lvert 11 \rangle$

  • Let's look at the case $\lvert w \rangle = \lvert 11 \rangle$. The oracle $U_\omega$ in this case acts as follows:
$$U_\omega | s \rangle = U_\omega \frac{1}{2}\left( |00\rangle + |01\rangle + |10\rangle + |11\rangle \right) = \frac{1}{2}\left( |00\rangle + |01\rangle + |10\rangle - |11\rangle \right).$$

or:

$$ U_\omega = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ \end{bmatrix} $$

which you may recognize as the controlled-Z gate. Thus, for this example, our oracle is simply the controlled-Z gate:

grover_circuit_2qbuits_oracle_11.PNG

Reflection $U_s$

  • In order to complete the circuit we need to implement the additional reflection $U_s = 2|s\rangle\langle s| - \mathbb{1}$. Since this is a reflection about $|s\rangle$, we want to add a negative phase to every state orthogonal to $|s\rangle$.

  • One way we can do this is to use the operation that transforms the state $|s\rangle \rightarrow |0\rangle$, which we already know is the Hadamard gate applied to each qubit:

$$H^{\otimes n}|s\rangle = |0\rangle$$
  • Then we apply a circuit that adds a negative phase to the states orthogonal to $|0\rangle$:
$$U_0 \frac{1}{2}\left( \lvert 00 \rangle + \lvert 01 \rangle + \lvert 10 \rangle + \lvert 11 \rangle \right) = \frac{1}{2}\left( \lvert 00 \rangle - \lvert 01 \rangle - \lvert 10 \rangle - \lvert 11 \rangle \right)$$

i.e. the signs of each state are flipped except for $\lvert 00 \rangle$. As can easily be verified, one way of implementing $U_0$ is the following circuit:

grover_circuit_2qbuits_reflection_0.PNG

  • Finally, we do the operation that transforms the state $|0\rangle \rightarrow |s\rangle$ (the H-gate again):
$$H^{\otimes n}U_0 H^{\otimes n} = U_s$$

The complete circuit for $U_s$ looks like this:

grover_circuit_2qbuits_reflection.PNG
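We can verify with NumPy that this circuit really implements $U_s = 2|s\rangle\langle s| - \mathbb{1}$ (a quick matrix check; the diagonal gates commute, so the left-to-right circuit order and the right-to-left matrix product agree):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
Z = np.diag([1, -1])
CZ = np.diag([1, 1, 1, -1])

# Circuit above: H on both qubits, Z on both qubits, CZ, then H on both qubits
U = np.kron(H, H) @ np.kron(Z, Z) @ CZ @ np.kron(H, H)

s = np.full((4, 1), 0.5)            # uniform superposition |s> on 2 qubits
U_s = 2 * (s @ s.T) - np.eye(4)     # 2|s><s| - 1
assert np.allclose(U, U_s)          # the circuit equals the reflection exactly
```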

Full Circuit for $\lvert w \rangle = |11\rangle$

Since in the particular case of $N=4$ only one rotation is required, we can combine the above components to build the full circuit for Grover's algorithm for the case $\lvert w \rangle = |11\rangle$:

grover_circuit_2qubits_full_11.PNG

3.2.2 Qiskit Implementation¶

  • We now implement Grover's algorithm for the above case of 2 qubits for $\lvert w \rangle = |11\rangle$.
In [3]:
# Initialization
import matplotlib.pyplot as plt
import numpy as np

# Importing Qiskit
from qiskit import IBMQ, Aer, assemble, transpile
from qiskit import QuantumCircuit, ClassicalRegister, QuantumRegister
from qiskit.providers.ibmq import least_busy

# Import basic plot tools
from qiskit.visualization import plot_histogram
  • We start by preparing a quantum circuit with two qubits:
In [4]:
n = 2
grover_circuit = QuantumCircuit(n)
  • Then we simply need to write out the commands for the circuit depicted above. First, we need to initialize the state $|s\rangle$. Let's create a general function (for any number of qubits) so we can use it again later:
In [5]:
def initialize_s(qc, qubits):
    """Apply a H-gate to 'qubits' in qc"""
    for q in qubits:
        qc.h(q)
    return qc
In [6]:
grover_circuit = initialize_s(grover_circuit, [0,1])
grover_circuit.barrier()
grover_circuit.draw(output='mpl')
Out[6]:
  • Apply the Oracle for $|w\rangle = |11\rangle$. This oracle is specific to 2 qubits:
In [7]:
grover_circuit.cz(0,1) # Oracle
grover_circuit.barrier()
grover_circuit.draw(output='mpl')
Out[7]:
  • We now want to apply the diffuser ($U_s$). The cell below writes it out gate by gate for 2 qubits; in the 3-qubit example we'll define a general diffuser that works for any number of qubits.
In [8]:
# Diffusion operator (U_s)
grover_circuit.h([0,1])
grover_circuit.z([0,1])
grover_circuit.cz(0,1)
grover_circuit.h([0,1])
grover_circuit.barrier()
grover_circuit.draw(output='mpl')
Out[8]:

3.2.3 Experiment with Simulators¶

Let's run the circuit in simulation. First, we can verify that we have the correct statevector:

In [9]:
sim = Aer.get_backend('aer_simulator')
# we need to make a copy of the circuit with the 'save_statevector'
# instruction to run on the Aer simulator
grover_circuit_sim = grover_circuit.copy()
grover_circuit_sim.save_statevector()
qobj = assemble(grover_circuit_sim)
result = sim.run(qobj).result()
statevec = result.get_statevector()

# Display the output state as a vector
display(array_to_latex(statevec, prefix="\\text{Statevector } |\\psi\\rangle = "))
# from qiskit_textbook.tools import vector2latex
# vector2latex(statevec, pretext="|\\psi\\rangle =")
$$ \text{Statevector } |\psi\rangle = \begin{bmatrix} 0 & 0 & 0 & 1 \\ \end{bmatrix} $$
  • As expected, the amplitude of every state that is not $|11\rangle$ is zero; this means we have a 100% chance of measuring $|11\rangle$:
In [10]:
grover_circuit.measure_all()

aer_sim = Aer.get_backend('aer_simulator')
qobj = assemble(grover_circuit)
result = aer_sim.run(qobj).result()
counts = result.get_counts()
plot_histogram(counts)
Out[10]:
  • We confirm that the state $|11\rangle$ is measured in every shot. On a noiseless simulator there are no other outcomes; on real hardware, errors in the quantum computation would produce a small number of other results.

3.3 Example: 3 Qubits¶

3.3.1 3 Qubits Grover's algorithm circuit¶

We now go through the example of Grover's algorithm for 3 qubits with two marked states $\lvert101\rangle$ and $\lvert110\rangle$, following the implementation found in Reference [2]. The quantum circuit to solve the problem using a phase oracle is:

grover_circuit_3qubits.png

  • 1) Apply Hadamard gates to $3$ qubits initialized to $\lvert000\rangle$ to create a uniform superposition: $$\lvert \psi_1 \rangle = \frac{1}{\sqrt{8}} \left( \lvert000\rangle + \lvert001\rangle + \lvert010\rangle + \lvert011\rangle + \lvert100\rangle + \lvert101\rangle + \lvert110\rangle + \lvert111\rangle \right) $$

  • 2) Mark states $\lvert101\rangle$ and $\lvert110\rangle$ using a phase oracle: $$\lvert \psi_2 \rangle = \frac{1}{\sqrt{8}} \left( \lvert000\rangle + \lvert001\rangle + \lvert010\rangle + \lvert011\rangle + \lvert100\rangle - \lvert101\rangle - \lvert110\rangle + \lvert111\rangle \right) $$

  • 3) Perform the reflection around the average amplitude:

    • A. Apply Hadamard gates to the qubits $$\lvert \psi_{3a} \rangle = \frac{1}{2} \left( \lvert000\rangle +\lvert011\rangle +\lvert100\rangle -\lvert111\rangle \right) $$

    • B. Apply X gates to the qubits $$\lvert \psi_{3b} \rangle = \frac{1}{2} \left( -\lvert000\rangle +\lvert011\rangle +\lvert100\rangle +\lvert111\rangle \right) $$

    • C. Apply a doubly controlled Z gate between the 1, 2 (controls) and 3 (target) qubits $$\lvert \psi_{3c} \rangle = \frac{1}{2} \left( -\lvert000\rangle +\lvert011\rangle +\lvert100\rangle -\lvert111\rangle \right) $$

    • D. Apply X gates to the qubits $$\lvert \psi_{3d} \rangle = \frac{1}{2} \left( -\lvert000\rangle +\lvert011\rangle +\lvert100\rangle -\lvert111\rangle \right) $$

    • E. Apply Hadamard gates to the qubits $$\lvert \psi_{3e} \rangle = \frac{1}{\sqrt{2}} \left( -\lvert101\rangle -\lvert110\rangle \right) $$

  • 4) Measure the $3$ qubits to retrieve states $\lvert101\rangle$ and $\lvert110\rangle$

  • Note that since there are 2 solutions and 8 possibilities, we will only need to run one iteration (steps 2 & 3).
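The sequence of steps above amounts to one oracle call followed by one reflection about the average, which we can trace through in plain NumPy (indices 0-7 stand for $|000\rangle$-$|111\rangle$; the result matches $\lvert \psi_{3e} \rangle$ up to a global phase):

```python
import numpy as np

N = 8
s = np.full(N, 1 / np.sqrt(N))             # step 1: uniform superposition
U_w = np.diag([1, 1, 1, 1, 1, -1, -1, 1])  # step 2: phase oracle marking |101>, |110>
U_s = 2 * np.outer(s, s) - np.eye(N)       # step 3: reflection about the average
psi = U_s @ (U_w @ s)                      # one full Grover iteration
# all amplitude now sits on indices 5 (|101>) and 6 (|110>)
```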

3.3.2 Qiskit Implementation¶

  • We now implement Grover's algorithm for the above example of 3 qubits, searching for the two marked states $\lvert101\rangle$ and $\lvert110\rangle$.
  • Note: Remember that Qiskit orders its qubits the opposite way round to this resource, so the circuit drawn will appear flipped about the horizontal.
  • We create a phase oracle that will mark states $\lvert101\rangle$ and $\lvert110\rangle$ as the results (step 2 above).
In [11]:
qc = QuantumCircuit(3)
qc.cz(0, 2)  # phase-flip states where qubits 0 and 2 are |1>, i.e. |101>
qc.cz(1, 2)  # phase-flip states where qubits 1 and 2 are |1>, i.e. |110>
oracle_ex3 = qc.to_gate()
oracle_ex3.name = "U$_\omega$"
  • In the last section, we used a diffuser specific to 2 qubits; in the cell below we will create a general diffuser for any number of qubits.

Details: Creating a General Diffuser

Remember that we can create $U_s$ from $U_0$:

$$ U_s = H^{\otimes n} U_0 H^{\otimes n} $$

And a multi-controlled-Z gate ($MCZ$) inverts the phase of the state $|11\dots 1\rangle$:

$$ MCZ = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & -1 \\ \end{bmatrix} \begin{aligned} \\ \\ \\ \leftarrow \text{Add negative phase to} \; |11\dots 1\rangle\\ \end{aligned} $$

Applying an X-gate to each qubit performs the transformation:

$$ \begin{aligned} |00\dots 0\rangle & \rightarrow |11\dots 1\rangle\\ |11\dots 1\rangle & \rightarrow |00\dots 0\rangle \end{aligned} $$

So:

$$ U_0 = - X^{\otimes n} (MCZ) X^{\otimes n} $$

Using these properties together, we can create $U_s$ using H-gates, X-gates, and a single multi-controlled-Z gate:

$$ U_s = - H^{\otimes n} U_0 H^{\otimes n} = H^{\otimes n} X^{\otimes n} (MCZ) X^{\otimes n} H^{\otimes n} $$

Note that we can ignore the global phase of -1.
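A quick NumPy check of this identity for $n = 3$, with the global phase written out explicitly:

```python
import numpy as np
from functools import reduce

def kron_n(M, n):
    """n-fold tensor (Kronecker) power of a one-qubit gate."""
    return reduce(np.kron, [M] * n)

n = 3
H = kron_n(np.array([[1, 1], [1, -1]]) / np.sqrt(2), n)
X = kron_n(np.array([[0, 1], [1, 0]]), n)
MCZ = np.diag([1] * (2 ** n - 1) + [-1])   # negative phase on |11...1> only

U = H @ X @ MCZ @ X @ H                    # right-hand side of the identity above
s = np.full((2 ** n, 1), 2 ** (-n / 2))    # uniform superposition |s>
U_s = 2 * (s @ s.T) - np.eye(2 ** n)       # 2|s><s| - 1
assert np.allclose(U, -U_s)                # equal up to the global phase -1
```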

In [12]:
def diffuser(nqubits):
    qc = QuantumCircuit(nqubits)
    # Apply transformation |s> -> |00..0> (H-gates)
    for qubit in range(nqubits):
        qc.h(qubit)
    # Apply transformation |00..0> -> |11..1> (X-gates)
    for qubit in range(nqubits):
        qc.x(qubit)
    # Do multi-controlled-Z gate
    qc.h(nqubits-1)
    qc.mct(list(range(nqubits-1)), nqubits-1)  # multi-controlled-toffoli
    qc.h(nqubits-1)
    # Apply transformation |11..1> -> |00..0>
    for qubit in range(nqubits):
        qc.x(qubit)
    # Apply transformation |00..0> -> |s>
    for qubit in range(nqubits):
        qc.h(qubit)
    # We will return the diffuser as a gate
    U_s = qc.to_gate()
    U_s.name = "U$_s$"
    return U_s
  • We'll now put the pieces together, with the creation of a uniform superposition at the start of the circuit and a measurement at the end. Note that since there are 2 solutions and 8 possibilities, we will only need to run one iteration.
In [13]:
n = 3
grover_circuit = QuantumCircuit(n)
grover_circuit = initialize_s(grover_circuit, [0,1,2])
grover_circuit.barrier()
grover_circuit.append(oracle_ex3, [0,1,2])
grover_circuit.barrier()
grover_circuit.append(diffuser(n), [0,1,2])
grover_circuit.measure_all()
grover_circuit.draw(output='mpl')
Out[13]:
  • To see the quantum circuit in more detail, apply decompose() to the circuit object.
In [14]:
grover_circuit.decompose().draw(output='mpl')
Out[14]:

3.3.3 Experiment with Simulators¶

  • We can run the above circuit on the simulator.
In [15]:
aer_sim = Aer.get_backend('aer_simulator')
transpiled_grover_circuit = transpile(grover_circuit, aer_sim)
qobj = assemble(transpiled_grover_circuit)
results = aer_sim.run(qobj).result()
counts = results.get_counts()
plot_histogram(counts)
Out[15]:
  • As we can see, the algorithm discovers our marked states $\lvert101\rangle$ and $\lvert110\rangle$.

4. Quantum Algorithms for Application¶

  • We will see various types of algorithms that use quantum circuits to solve suitable problems.

4.1 Regression¶

  • Regression is a statistical method for estimating the relationship between a response variable and independent variables.
  • Simply put, we are trying to find the model that best explains the data.
  • Regression using quantum codes:

    • Regression with an EstimatorQNN
    • Variational Quantum Regressor (VQR)
  • Link: Neural Network Regressor

In [16]:
IBMQ.load_account()
Out[16]:
<AccountProvider for IBMQ(hub='ibm-q', group='open', project='main')>
In [17]:
from qiskit import QuantumCircuit
from qiskit.algorithms.optimizers import COBYLA, L_BFGS_B
from qiskit.circuit import Parameter
from qiskit.circuit.library import RealAmplitudes, ZZFeatureMap
from qiskit.utils import algorithm_globals

from qiskit_machine_learning.algorithms.classifiers import NeuralNetworkClassifier, VQC
from qiskit_machine_learning.algorithms.regressors import NeuralNetworkRegressor, VQR
from qiskit_machine_learning.neural_networks import SamplerQNN, EstimatorQNN
  • Load the packages and functions we'll be using.
In [18]:
from qiskit.utils import QuantumInstance, algorithm_globals
from qiskit.circuit import Parameter
# from qiskit.circuit.library import RealAmplitudes, ZZFeatureMap
from qiskit.algorithms.optimizers import COBYLA, L_BFGS_B

from qiskit_machine_learning.neural_networks import TwoLayerQNN, CircuitQNN
# from qiskit_machine_learning.algorithms.classifiers import NeuralNetworkClassifier, VQC
from qiskit_machine_learning.algorithms.regressors import NeuralNetworkRegressor, VQR

from IPython.display import clear_output
In [19]:
# Loading Quantum backend - aer_simulator
quantum_instance = QuantumInstance(Aer.get_backend("aer_simulator"), shots=1024)
  • Let's create the dataset we will use.
In [20]:
algorithm_globals.random_seed = 42  # setting the random_seed number

num_samples = 20  # the number of observations of the data to create
eps = 0.2  # noise scale multiplied into the error term
lb, ub = -np.pi, np.pi  # lower bound = -pi, upper bound = +pi

# Create a dataset generated from the sine function (true function values)
X_ = np.linspace(lb, ub, num=50).reshape(50, 1)
f = lambda x: np.sin(x)

# Sample random x values and add noise to the corresponding sine values to get y
X = (ub - lb) * algorithm_globals.random.random([num_samples, 1]) + lb
y = f(X[:, 0]) + eps * (2 * algorithm_globals.random.random(num_samples) - 1)
In [21]:
# Datasets
X
Out[21]:
array([[ 1.72131662],
       [-0.38403809],
       [ 2.25313718],
       [ 1.2400999 ],
       [-2.54985893],
       [ 2.98842337],
       [ 1.64078914],
       [ 1.79739504],
       [-2.33663096],
       [-0.31173435],
       [-0.81179996],
       [ 2.68144351],
       [ 0.90393121],
       [ 2.02797103],
       [-0.35553907],
       [-1.71380966],
       [ 0.34296633],
       [-2.74061701],
       [ 2.05856737],
       [ 0.82727182]])
In [22]:
y
Out[22]:
array([ 1.09192829, -0.43285708,  0.96437789,  1.10306489, -0.44644741,
        0.03042655,  0.9842399 ,  0.79195768, -0.8590883 , -0.23349027,
       -0.6276222 ,  0.63108564,  0.71609467,  0.84548709, -0.36027345,
       -1.11400247,  0.18825075, -0.40003482,  0.77414341,  0.80401304])
In [23]:
# Plotting true function values (sine function) and the generated data
plt.figure(figsize=(8,6))
plt.plot(X_, f(X_), "r--")
plt.plot(X, y, "bo")
plt.show()
  • Red dotted line: true function values $f(x) = \sin(x)$, where $x \in (-\pi, \pi)$
  • Blue dots: data generated by adding noise to the true function values
  • **Our objective** is to recover the true function (the red dotted line) using only the data we have (the blue dots).

4.1.1 Regression with an EstimatorQNN¶

  • Here we restrict to regression with an EstimatorQNN that returns values in $\big[-1, 1\big]$.
    • Feature map: a circuit that encodes the data so that its features are well represented
    • Ansatz: a parameterized circuit whose form is assumed for the dependent variable (the solution)
  • More complex and multi-dimensional models could be constructed, for example based on a SamplerQNN, but that exceeds the scope of this tutorial.
In [24]:
# construct simple feature map
param_x = Parameter("x")
feature_map = QuantumCircuit(1, name="fm")
feature_map.ry(param_x, 0)

feature_map.draw(output = 'mpl')
Out[24]:
In [25]:
# construct simple ansatz
param_y = Parameter("y")
ansatz = QuantumCircuit(1, name="vf")
ansatz.ry(param_y, 0)

ansatz.draw(output = 'mpl')
Out[25]:
In [26]:
# construct a circuit
qc = QuantumCircuit(1)
qc.compose(feature_map, inplace=True)
qc.compose(ansatz, inplace=True)

qc.draw(output = 'mpl')
Out[26]:
In [27]:
# construct QNN
regression_estimator_qnn = EstimatorQNN(
    circuit=qc, input_params=feature_map.parameters, weight_params=ansatz.parameters
)
In [28]:
# callback function that draws a live plot when the .fit() method is called
def callback_graph(weights, obj_func_eval):
    clear_output(wait=True)
    objective_func_vals.append(obj_func_eval)
    plt.title("Objective function value against iteration")
    plt.xlabel("Iteration")
    plt.ylabel("Objective function value")
    plt.plot(range(len(objective_func_vals)), objective_func_vals)
    plt.show()
In [29]:
# construct the regressor from the neural network
# callback function that draws a live plot when the .fit() method is called

regressor = NeuralNetworkRegressor(
    neural_network=regression_estimator_qnn,
    loss="squared_error",
    optimizer=L_BFGS_B(maxiter=5),
    callback=callback_graph,
)
In [30]:
# create empty array for callback to store evaluations of the objective function
objective_func_vals = []
plt.rcParams["figure.figsize"] = (12, 6)

# fit to data
regressor.fit(X, y)

# return to default figsize
plt.rcParams["figure.figsize"] = (6, 4)
In [31]:
# score the result - returns the coefficient of determination R^2 of the prediction
regressor.score(X, y)
Out[31]:
0.9733353285953001
  • Similarly to the classification models, we can obtain an array of trained weights by querying a corresponding property of the model. In this model we have only one parameter defined as param_y above.
In [32]:
# plot target function
plt.figure(figsize=(8,6))
plt.plot(X_, f(X_), "r--")

# plot data
plt.plot(X, y, "bo")

# plot fitted line
y_ = regressor.predict(X_)
plt.plot(X_, y_, "g-")
plt.show()

4.1.2 Regression with the Variational Quantum Regressor (VQR)¶

  • VQR is a special variant of the NeuralNetworkRegressor with an OpflowQNN.

  • By default it uses the L2 loss function to minimize the mean squared error between predictions and targets.

    • Mean squared error: $MSE = \frac{1}{m}\sum_{i=1}^{m}{\big(y_i-\hat{y}_i\big)^2}$
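For reference, this loss is easy to sketch in plain NumPy (`mse` is a name introduced here, not the library's implementation):

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error between targets y and predictions y_hat."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean((y - y_hat) ** 2)

mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])   # (0 + 0.25 + 1) / 3
```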
In [33]:
vqr = VQR(
    feature_map=feature_map,
    ansatz=ansatz,
    optimizer=L_BFGS_B(),
    quantum_instance=quantum_instance,
    callback=callback_graph,
)
C:\Users\user\AppData\Local\Temp\ipykernel_9996\1717641275.py:1: DeprecationWarning: The quantum_instance argument is deprecated as of version 0.5.0 and will be removed no sooner than 3 months after the release. Instead use the estimator argument.
  vqr = VQR(
In [34]:
# create empty array for callback to store evaluations of the objective function
objective_func_vals = []
plt.rcParams["figure.figsize"] = (12, 6)

# fit regressor
vqr.fit(X, y)
# return to default figsize
plt.rcParams["figure.figsize"] = (6, 4)
In [35]:
# score the result - returns the coefficient of determination R^2 of the prediction
vqr.score(X, y)
Out[35]:
0.9690894881891703
In [36]:
# plot target function
plt.figure(figsize=(8,6))
plt.plot(X_, f(X_), "r--")

# plot data
plt.plot(X, y, "bo")

# plot fitted line
y_ = vqr.predict(X_)
plt.plot(X_, y_, "g-")
plt.show()

5. Xanadu - PennyLane¶

Xanadu_logo.png

  • Xanadu Quantum Technologies is a Canadian quantum computing hardware and software company based in Toronto.
  • While IBM Quantum builds quantum systems from superconducting circuits, Xanadu's hardware is based on photonic quantum computers.
  • Both IBM Quantum and Xanadu provide cloud-accessible quantum computers together with Python-based code libraries.
  • Link: Xanadu - PennyLane

PennyLane_image.png

  • PennyLane is the leading tool for programming quantum computers. A cross-platform Python library, it enables a new paradigm — quantum differentiable programming — that enables seamless integration with machine learning tools. Train a quantum computer like you would train a neural network. PennyLane also supports a comprehensive set of features, simulators, hardware, and community-led resources that enable users of all levels to easily build, optimize and deploy quantum-classical applications.
  • One of the advantages of using PennyLane is that it can easily dispatch to different quantum devices, such as IBM-Qiskit, AWS-Amazon Braket, Google-Cirq, Honeywell, IonQ, and so on.
  • "PennyLane is an open-source software framework built around the concept of quantum differentiable programming. It seamlessly integrates classical machine learning libraries with quantum simulators and hardware, giving users the power to train quantum circuits."

quantum_companies.png

  • Various types of quantum computers

    • Integrated Photonics - computers based on photonic integrated circuits
    • Superconducting Circuits - quantum computers based on superconductors
    • Trapped Ions - quantum computers based on trapped ions
  • For more information, here are some useful links to check.

    • Link: PennyLane front page
    • Link: PennyLane Documentation
    • Link: PennyLane - plugins and ecosystem
  • To install PennyLane

  • Link: PennyLane installation

In [37]:
# install the latest released version of PennyLane
#!pip install pennylane
In [38]:
from pennylane import numpy as np
import matplotlib.pyplot as plt
from math import pi

import pennylane as qml

5.1 PennyLane Basics¶

5.1.1 Building Quantum circuits¶

  • Just like IBM-Q Qiskit, Xanadu provides a Python-based code library called PennyLane.
  • We will learn how to build quantum circuits using PennyLane codes.

  • Link: Using PennyLane - Quantum circuits


Defining a device

  • To run, and later optimize, a quantum circuit, one first needs to specify a computational device.

  • device(name, wires=1, *args, **kwargs)

    • name: the name of the device / simulator to load.
    • wires: the number of wires to initialize. Think of each wire as one qubit.
    • shots: the number of times the circuit is executed.
In [39]:
# Setup the device using qml.device()
## Assigning the device and defining the measurement of the circuit
dev = qml.device(name = 'default.qubit', wires = 2)

@qml.qnode(dev)
def circuit():
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

drawer = qml.draw(circuit, show_all_wires=True)
In [40]:
print(drawer())
0: ───┤ ╭<Z@Z>
1: ───┤ ╰<Z@Z>
In [41]:
# Setup the device using qml.device()
## Assigning names to each wire
dev = qml.device(name = 'default.qubit', wires = ['a', 'b'])

@qml.qnode(dev)
def circuit(x):
    return qml.expval(qml.PauliZ('a') @ qml.PauliZ('b'))

drawer = qml.draw(circuit, show_all_wires=True)
print(drawer(0.5))
a: ───┤ ╭<Z@Z>
b: ───┤ ╰<Z@Z>
In [42]:
# Setup the device using qml.device()
## Assigning the number of shots
shots_list = [5, 10, 1000] # number of shots for each batch
dev = qml.device("default.qubit", wires=2, shots=shots_list)

@qml.qnode(dev)
def circuit(x):
    qml.RX(x, wires=0)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(0) @ qml.PauliX(1)), qml.expval(qml.PauliZ(0))

drawer = qml.draw(circuit, show_all_wires=True)

print(drawer(x = 0.5))
0: ──RX(0.50)─╭●─┤ ╭<Z@X>  <Z>
1: ───────────╰X─┤ ╰<Z@X>     
In [43]:
## The result of the execution of the circuit.
circuit(0.5)
Out[43]:
tensor([[-0.2  ,  1.   ],
        [-0.6  ,  1.   ],
        [ 0.002,  0.876]], requires_grad=True)
  • Each row corresponds to one batch of shots (here 5, 10, and 1000 shots), and each column to one of the returned observables.

Creating a quantum node (Quantum operators + measurement)

  • Together, a quantum function and a device are used to create a quantum node or QNode object, which wraps the quantum function and binds it to the device.
  • Link: PennyLane - Quantum circuits
  • Link: PennyLane - Quantum operators
In [44]:
dev = qml.device('default.qubit', wires=['aux', 'q1', 'q2'])

# Quantum function
def my_quantum_function(x, y):
    qml.RZ(x, wires='q1')
    qml.CNOT(wires=['aux' ,'q1'])
    qml.RY(y, wires='q2')
    return qml.expval(qml.PauliZ('q2'))

circuit = qml.QNode(my_quantum_function, dev)

circuit(x = np.pi/4, y = 0.7)
Out[44]:
tensor(0.76484219, requires_grad=True)
In [45]:
# Plot the circuit
print(qml.draw(circuit)(np.pi/4, 0.7))
aux: ───────────╭●─┤     
 q1: ──RZ(0.79)─╰X─┤     
 q2: ──RY(0.70)────┤  <Z>
In [46]:
# Plot the circuit with matplotlib.pyplot library
import matplotlib.pyplot as plt
fig, ax = qml.draw_mpl(circuit)(x = np.pi/4, y = 0.7)
plt.show()

Measurement

  • One of the advantages of using PennyLane is that it can extract different types of measurement results from quantum devices: the expectation value of an observable, its variance, samples of a single measurement, or computational basis state probabilities.
    • Expectation values, samples from the shots, variances, probabilities, amplitudes, etc. of the observables.
  • Link: PennyLane
  • Measurement 1) Sampling results from each shot
In [47]:
# Measurement 1) Sampling results from each shot
dev = qml.device("default.qubit", wires=2, shots=1000)

@qml.qnode(dev)
def circuit():
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])
    return qml.sample(qml.PauliZ(0)), qml.sample(qml.PauliZ(1))


fig, ax = qml.draw_mpl(circuit)()
plt.show()
In [48]:
result_sample = circuit()

# Print out the dimension of the result
print(result_sample.shape)

# Let's look at the result
result_sample
(2, 1000)
Out[48]:
tensor([[-1, -1,  1, ...,  1,  1,  1],
        [-1, -1,  1, ...,  1,  1,  1]], dtype=int64, requires_grad=True)
  • Each wire is represented as a row, and the results from each shot are arranged along the columns.
  • In PennyLane, the measured eigenvalue 1 corresponds to $|0\rangle$, and -1 corresponds to $|1\rangle$.
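This eigenvalue-to-bit mapping can be checked with a small NumPy sketch (illustrative, with made-up sample values):

```python
import numpy as np

# hypothetical PauliZ samples: eigenvalue +1 means |0>, eigenvalue -1 means |1>
samples = np.array([1, -1, 1, 1, -1])

bits = (1 - samples) // 2  # maps +1 -> 0 and -1 -> 1
print(bits)  # [0 1 0 0 1]
```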
  • Measurement 2) Computational basis state probabilities
In [49]:
# Measurement 2) Computational basis state probabilities
dev = qml.device("default.qubit", wires=2, shots=1000)

@qml.qnode(dev)
def circuit():
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])
    return qml.probs(wires=[0, 1])

fig, ax = qml.draw_mpl(circuit)()
plt.show()
In [50]:
result_prob = circuit()

# Print out the dimension of the result
print(result_prob.shape)

# Let's look at the result
result_prob
(4,)
Out[50]:
tensor([0.471, 0.   , 0.   , 0.529], requires_grad=True)
  • Since we used 2 wires (2 qubits), the result of the circuit consists of the empirical probability of each of $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ (in order of increasing binary number, with $|q_0 q_1\rangle=|q_0\rangle\otimes|q_1\rangle$).
  • In this case, the result of the circuit represents $\big[Pr(|00\rangle), Pr(|01\rangle), Pr(|10\rangle), Pr(|11\rangle)\big]$.
  • Since the final quantum state of this circuit is $|00\rangle \mapsto \frac{1}{\sqrt{2}}\big(|00\rangle+|11\rangle\big)$, the probability of observing $|00\rangle$ or $|11\rangle$ equals 50% each, and 0% for the other states.
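We can verify these probabilities directly from the state vector with plain NumPy (an illustrative sketch, independent of PennyLane):

```python
import numpy as np

# computational basis states |00> and |11> as 4-dimensional vectors
ket00 = np.array([1.0, 0.0, 0.0, 0.0])
ket11 = np.array([0.0, 0.0, 0.0, 1.0])

# Bell state (|00> + |11>) / sqrt(2)
bell = (ket00 + ket11) / np.sqrt(2)

# Born rule: probabilities are the squared magnitudes of the amplitudes
probs = np.abs(bell) ** 2
print(probs)  # [0.5 0.  0.  0.5]
```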
  • Measurement 3) Tensor observables / Expectation
In [51]:
# Measurement 3) Tensor observables / Expectation values
dev = qml.device("default.qubit", wires=3, shots=1000)

@qml.qnode(dev)
def my_quantum_function(x, y):
    qml.RZ(x, wires=0)
    qml.CNOT(wires=[0, 1])
    qml.RY(y, wires=1)
    qml.CNOT(wires=[0, 2])
    return qml.expval(qml.PauliZ(0) @ qml.Identity(1) @qml.PauliX(2))

fig, ax = qml.draw_mpl(my_quantum_function)(x = pi/2, y = pi/3)
plt.show()
In [52]:
result_tensor = my_quantum_function(x = pi/2, y = pi/3)

# Let's look at the result
result_tensor
Out[52]:
tensor(-0.004, requires_grad=True)

4.2 Variational classifier using PennyLane¶

  • We show how to use PennyLane to implement variational quantum classifiers - quantum circuits that can be trained from labelled data to classify new data samples.
  • Link: PennyLane - Variational classifier
  • We will first show that the variational quantum classifier can reproduce the parity function, a Boolean function whose value is one if and only if the input bitstring has an odd number of ones.
$$ f:x \in \big\{0,1\big\}^{\otimes n} \rightarrow y = \begin{cases} 1, & \mbox{if there is an odd number of ones in } x \\ 0, & \mbox{otherwise} \end{cases} $$
  • This optimization example demonstrates how to encode binary inputs into the initial state of the variational circuit, which is simply a computational basis state.

  • We then show how to encode real vectors as amplitude vectors (amplitude encoding) and train the model to recognize the first two classes of flowers in the Iris dataset.
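For reference, the parity target defined above can be computed classically in a few lines of Python (an illustrative sketch, not part of the quantum model):

```python
def parity(x):
    # returns 1 if x contains an odd number of ones, else 0
    return sum(x) % 2

print(parity([0, 1, 0, 1]))  # 0 (even number of ones)
print(parity([1, 1, 1, 0]))  # 1 (odd number of ones)
```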

4.2.1 Fitting the parity function¶

Load packages / functions

  • We import PennyLane, the PennyLane-provided version of NumPy, and an optimizer.
In [53]:
import pennylane as qml
from pennylane import numpy as np
from pennylane.optimize import NesterovMomentumOptimizer

4.2.1.1 Quantum and classical nodes¶

  • We create a quantum device with four “wires” (or qubits).
In [54]:
dev = qml.device("default.qubit", wires=4)
  • Variational classifiers usually define a “layer” or “block”, which is an elementary circuit architecture that gets repeated to build the variational circuit.

  • Our circuit layer consists of an arbitrary rotation on every qubit, as well as CNOTs that entangle each qubit with its neighbor.

  • qml.Rot applies an arbitrary single-qubit rotation, parametrized by three Euler angles.

In [55]:
def layer(W):
    qml.Rot(W[0, 0], W[0, 1], W[0, 2], wires=0)
    qml.Rot(W[1, 0], W[1, 1], W[1, 2], wires=1)
    qml.Rot(W[2, 0], W[2, 1], W[2, 2], wires=2)
    qml.Rot(W[3, 0], W[3, 1], W[3, 2], wires=3)

    qml.CNOT(wires=[0, 1])
    qml.CNOT(wires=[1, 2])
    qml.CNOT(wires=[2, 3])
    qml.CNOT(wires=[3, 0])

We also need a way to encode data inputs $x$ into the circuit, so that the measured output depends on the inputs. In this first example, the inputs are bitstrings, which we encode into the state of the qubits. The quantum state $\psi$ after state preparation is a computational basis state that has 1s where $x$ has 1s, for example,

$$ x = 0101 \rightarrow \vert\psi\rangle = \vert0101\rangle. $$

We use the BasisState function provided by PennyLane, which expects $x$ to be a list of zeros and ones, i.e. [0,1,0,1].

In [56]:
def statepreparation(x):
    qml.BasisState(x, wires=[0, 1, 2, 3])

Now we define the quantum node as a state preparation routine, followed by a repetition of the layer structure. Borrowing from machine learning, we call the parameters weights.

In [57]:
@qml.qnode(dev)
def circuit(weights, x):

    statepreparation(x)

    for W in weights:
        layer(W)

    return qml.expval(qml.PauliZ(0))

Different from previous examples, the quantum node takes the data sample $x$ as an additional argument. Since the data arrays are created with requires_grad=False, they are treated as constants when calculating a gradient; they are never trained.

If we want to add a “classical” bias parameter, the variational quantum classifier also needs some post-processing. We define the final model as a classical function that adds a trainable bias to the output of the quantum node.

In [58]:
def variational_classifier(weights, bias, x):
    return circuit(weights, x) + bias

4.2.1.2 Cost¶

  • In supervised learning, the cost function is usually the sum of a loss function and a regularizer. We use the standard square loss that measures the distance between target labels and model predictions.
In [59]:
def square_loss(labels, predictions):
    loss = 0
    for l, p in zip(labels, predictions):
        loss = loss + (l - p) ** 2

    loss = loss / len(labels)
    return loss
  • To monitor how many inputs the current classifier predicted correctly, we also define the accuracy given target labels and model predictions.
In [60]:
def accuracy(labels, predictions):

    loss = 0
    for l, p in zip(labels, predictions):
        if abs(l - p) < 1e-5:
            loss = loss + 1
    loss = loss / len(labels)

    return loss
  • For learning tasks, the cost depends on the data - here, on the features and labels considered in each iteration of the optimization routine.
In [61]:
def cost(weights, bias, X, Y):
    predictions = [variational_classifier(weights, bias, x) for x in X]
    return square_loss(Y, predictions)

4.2.1.3 Optimization¶

  • Let's now load and preprocess some data.
In [62]:
from pennylane import numpy as np

data = np.array([[0., 0., 0., 0., 0],
                 [0., 0., 0., 1., 1],
                 [0., 0., 1., 0., 1],
                 [0., 0., 1., 1., 0],
                 [0., 1., 0., 0., 1],
                 [0., 1., 0., 1., 0],
                 [0., 1., 1., 0., 0],
                 [0., 1., 1., 1., 1],
                 [1., 0., 0., 0., 1],
                 [1., 0., 0., 1., 0],
                 [1., 0., 1., 0., 0],
                 [1., 0., 1., 1., 1],
                 [1., 1., 0., 0., 0],
                 [1., 1., 0., 1., 1],
                 [1., 1., 1., 0., 1],
                 [1., 1., 1., 1., 0]])
X = np.array(data[:, :-1], requires_grad=False)
Y = np.array(data[:, -1], requires_grad=False)
Y = Y * 2 - np.ones(len(Y))  # shift label from {0, 1} to {-1, 1}

for i in range(5):
    print("X = {}, Y = {: d}".format(X[i], int(Y[i])))

print("...")
X = [0. 0. 0. 0.], Y = -1
X = [0. 0. 0. 1.], Y =  1
X = [0. 0. 1. 0.], Y =  1
X = [0. 0. 1. 1.], Y = -1
X = [0. 1. 0. 0.], Y =  1
...
  • We initialize the variables randomly (but fix a seed for reproducibility). The first variable in the list is used as a bias, while the rest is fed into the gates of the variational circuit.
In [63]:
np.random.seed(0)
num_qubits = 4
num_layers = 2
weights_init = 0.01 * np.random.randn(num_layers, num_qubits, 3, requires_grad=True)
bias_init = np.array(0.0, requires_grad=True)

print(weights_init, bias_init)
[[[ 0.01764052  0.00400157  0.00978738]
  [ 0.02240893  0.01867558 -0.00977278]
  [ 0.00950088 -0.00151357 -0.00103219]
  [ 0.00410599  0.00144044  0.01454274]]

 [[ 0.00761038  0.00121675  0.00443863]
  [ 0.00333674  0.01494079 -0.00205158]
  [ 0.00313068 -0.00854096 -0.0255299 ]
  [ 0.00653619  0.00864436 -0.00742165]]] 0.0

Next we create an optimizer and choose a batch size…

In [64]:
opt = NesterovMomentumOptimizer(0.5)
batch_size = 5

…and train the optimizer. We track the accuracy - the share of correctly classified data samples. For this we compute the outputs of the variational classifier and turn them into predictions in $\big\{-1, 1\big\}$ by taking the sign of the output.

In [65]:
weights = weights_init
bias = bias_init
iter_num = 10
for it in range(iter_num):

    # Update the weights by one optimizer step
    batch_index = np.random.randint(0, len(X), (batch_size,))
    X_batch = X[batch_index]
    Y_batch = Y[batch_index]
    weights, bias, _, _ = opt.step(cost, weights, bias, X_batch, Y_batch)

    # Compute accuracy
    predictions = [np.sign(variational_classifier(weights, bias, x)) for x in X]
    acc = accuracy(Y, predictions)

    print(
        "Iter: {:5d} | Cost: {:0.7f} | Accuracy: {:0.7f} ".format(
            it + 1, cost(weights, bias, X, Y), acc
        )
    )
Iter:     1 | Cost: 3.4355534 | Accuracy: 0.5000000 
Iter:     2 | Cost: 1.9717733 | Accuracy: 0.5000000 
Iter:     3 | Cost: 1.8182812 | Accuracy: 0.5000000 
Iter:     4 | Cost: 1.5042404 | Accuracy: 0.5000000 
Iter:     5 | Cost: 1.1477739 | Accuracy: 0.5000000 
Iter:     6 | Cost: 1.2734990 | Accuracy: 0.6250000 
Iter:     7 | Cost: 0.8290628 | Accuracy: 0.5000000 
Iter:     8 | Cost: 0.3226183 | Accuracy: 1.0000000 
Iter:     9 | Cost: 0.1436206 | Accuracy: 1.0000000 
Iter:    10 | Cost: 0.2982810 | Accuracy: 1.0000000 

4.2.2 Dataset: Iris and its classification¶

4.2.2.1 Quantum and classical nodes¶

To encode real-valued vectors into the amplitudes of a quantum state, we use a 2-qubit simulator.

In [66]:
dev = qml.device("default.qubit", wires=2)
  • State preparation is not as simple as when we represent a bitstring with a basis state. Every input $x$ has to be translated into a set of angles which can get fed into a small routine for state preparation. To simplify things a bit, we will work with data from the positive subspace, so that we can ignore signs (which would require another cascade of rotations around the $z$ axis).
In [67]:
def get_angles(x):

    beta0 = 2 * np.arcsin(np.sqrt(x[1] ** 2) / np.sqrt(x[0] ** 2 + x[1] ** 2 + 1e-12))
    beta1 = 2 * np.arcsin(np.sqrt(x[3] ** 2) / np.sqrt(x[2] ** 2 + x[3] ** 2 + 1e-12))
    beta2 = 2 * np.arcsin(
        np.sqrt(x[2] ** 2 + x[3] ** 2)
        / np.sqrt(x[0] ** 2 + x[1] ** 2 + x[2] ** 2 + x[3] ** 2)
    )

    return np.array([beta2, -beta1 / 2, beta1 / 2, -beta0 / 2, beta0 / 2])


def statepreparation(a):
    qml.RY(a[0], wires=0)

    qml.CNOT(wires=[0, 1])
    qml.RY(a[1], wires=1)
    qml.CNOT(wires=[0, 1])
    qml.RY(a[2], wires=1)

    qml.PauliX(wires=0)
    qml.CNOT(wires=[0, 1])
    qml.RY(a[3], wires=1)
    qml.CNOT(wires=[0, 1])
    qml.RY(a[4], wires=1)
    qml.PauliX(wires=0)
  • Let’s test if this routine actually works.
In [68]:
x = np.array([0.53896774, 0.79503606, 0.27826503, 0.0], requires_grad=False)
ang = get_angles(x)


@qml.qnode(dev)
def test(angles):
    statepreparation(angles)
    return qml.expval(qml.PauliZ(0))

test(ang)

print("x               : ", x)
print("angles          : ", ang)
print("amplitude vector: ", np.real(dev.state))
x               :  [0.53896774 0.79503606 0.27826503 0.        ]
angles          :  [ 0.56397465 -0.          0.         -0.97504604  0.97504604]
amplitude vector:  [ 5.38967743e-01  7.95036065e-01  2.78265032e-01 -2.77555756e-17]

Note that the default.qubit simulator provides a shortcut to statepreparation with the command qml.QubitStateVector(x, wires=[0, 1]). However, some devices may not support an arbitrary state-preparation routine.

Since we are working with only 2 qubits now, we need to update the layer function as well.

In [69]:
def layer(W):
    qml.Rot(W[0, 0], W[0, 1], W[0, 2], wires=0)
    qml.Rot(W[1, 0], W[1, 1], W[1, 2], wires=1)
    qml.CNOT(wires=[0, 1])

The variational classifier model and its cost remain essentially the same, but we have to reload them with the new state preparation and layer functions.

In [70]:
@qml.qnode(dev)
def circuit(weights, angles):
    statepreparation(angles)

    for W in weights:
        layer(W)

    return qml.expval(qml.PauliZ(0))


def variational_classifier(weights, bias, angles):
    return circuit(weights, angles) + bias


def cost(weights, bias, features, labels):
    predictions = [variational_classifier(weights, bias, f) for f in features]
    return square_loss(labels, predictions)

4.2.2.2 Data: Iris (the iris flower dataset)¶

  • We then load the Iris data set. There is a bit of preprocessing to do in order to encode the inputs into the amplitudes of a quantum state. In the last preprocessing step, we translate the inputs x to rotation angles using the get_angles function we defined above.

iris_flowers.png

  • X variables: [Sepal length, Sepal width, Petal length, Petal width] (the lengths and widths of each sepal and petal)
  • Y variable: Species of iris flowers (0:"setosa", 1:"versicolor", 2:"virginica")
  • We are trying to classify iris flowers into the correct species using the measurements of various parts of the flower.
In [71]:
# !pip install pandas
# !pip install sklearn
In [72]:
from sklearn.datasets import load_iris
import pandas as pd
import numpy as np

# visualization package
import matplotlib.pyplot as plt
import seaborn as sns

iris = load_iris() # sample data load
#print(iris) # print out the data with variable types and its description
In [73]:
print(iris.DESCR) # Description of the dataset
.. _iris_dataset:

Iris plants dataset
--------------------

**Data Set Characteristics:**

    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica
                
    :Summary Statistics:

    ============== ==== ==== ======= ===== ====================
                    Min  Max   Mean    SD   Class Correlation
    ============== ==== ==== ======= ===== ====================
    sepal length:   4.3  7.9   5.84   0.83    0.7826
    sepal width:    2.0  4.4   3.05   0.43   -0.4194
    petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
    petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)
    ============== ==== ==== ======= ===== ====================

    :Missing Attribute Values: None
    :Class Distribution: 33.3% for each of 3 classes.
    :Creator: R.A. Fisher
    :Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
    :Date: July, 1988

The famous Iris database, first used by Sir R.A. Fisher. The dataset is taken
from Fisher's paper. Note that it's the same as in R, but not as in the UCI
Machine Learning Repository, which has two wrong data points.

This is perhaps the best known database to be found in the
pattern recognition literature.  Fisher's paper is a classic in the field and
is referenced frequently to this day.  (See Duda & Hart, for example.)  The
data set contains 3 classes of 50 instances each, where each class refers to a
type of iris plant.  One class is linearly separable from the other 2; the
latter are NOT linearly separable from each other.

.. topic:: References

   - Fisher, R.A. "The use of multiple measurements in taxonomic problems"
     Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to
     Mathematical Statistics" (John Wiley, NY, 1950).
   - Duda, R.O., & Hart, P.E. (1973) Pattern Classification and Scene Analysis.
     (Q327.D83) John Wiley & Sons.  ISBN 0-471-22361-1.  See page 218.
   - Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
     Structure and Classification Rule for Recognition in Partially Exposed
     Environments".  IEEE Transactions on Pattern Analysis and Machine
     Intelligence, Vol. PAMI-2, No. 1, 67-71.
   - Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule".  IEEE Transactions
     on Information Theory, May 1972, 431-433.
   - See also: 1988 MLC Proceedings, 54-64.  Cheeseman et al"s AUTOCLASS II
     conceptual clustering system finds 3 classes in the data.
   - Many, many more ...
In [74]:
# Create a data frame that clearly shows the feature_names (x variables) and the target (y variable)
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['target'] = iris.target

# Convert the numeric target (y variable) values (0, 1, 2) to string labels
df['target'] = df['target'].map({0:"setosa", 1:"versicolor", 2:"virginica"})
print(df)

sns.pairplot(df, hue="target", height=3)
plt.show()
     sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  \
0                  5.1               3.5                1.4               0.2   
1                  4.9               3.0                1.4               0.2   
2                  4.7               3.2                1.3               0.2   
3                  4.6               3.1                1.5               0.2   
4                  5.0               3.6                1.4               0.2   
..                 ...               ...                ...               ...   
145                6.7               3.0                5.2               2.3   
146                6.3               2.5                5.0               1.9   
147                6.5               3.0                5.2               2.0   
148                6.2               3.4                5.4               2.3   
149                5.9               3.0                5.1               1.8   

        target  
0       setosa  
1       setosa  
2       setosa  
3       setosa  
4       setosa  
..         ...  
145  virginica  
146  virginica  
147  virginica  
148  virginica  
149  virginica  

[150 rows x 5 columns]

4.2.2.3 Data preparation to load to quantum circuit¶

In [75]:
from pennylane import numpy as np

data = np.loadtxt("iris_classes1and2_scaled.txt")
X = data[:, 0:2]
print("First X sample (original)  :", X[0])

# pad the vectors to size 2^2 with constant values
padding = 0.3 * np.ones((len(X), 1))
X_pad = np.c_[np.c_[X, padding], np.zeros((len(X), 1))]
print("First X sample (padded)    :", X_pad[0])

# normalize each input
normalization = np.sqrt(np.sum(X_pad ** 2, -1))
X_norm = (X_pad.T / normalization).T
print("First X sample (normalized):", X_norm[0])

# angles for state preparation are new features
features = np.array([get_angles(x) for x in X_norm], requires_grad=False)
print("First features sample      :", features[0])

Y = data[:, -1]
First X sample (original)  : [0.4  0.75]
First X sample (padded)    : [0.4  0.75 0.3  0.  ]
First X sample (normalized): [0.44376016 0.83205029 0.33282012 0.        ]
First features sample      : [ 0.67858523 -0.          0.         -1.080839    1.080839  ]
  • These angles are our new features, which is why we have renamed X to “features” above. Let’s plot the stages of preprocessing and play around with the dimensions (dim1, dim2). Some of them still separate the classes well, while others are less informative.
In [76]:
import matplotlib.pyplot as plt

plt.figure()
plt.scatter(X[:, 0][Y == 1], X[:, 1][Y == 1], c="b", marker="o", edgecolors="k")
plt.scatter(X[:, 0][Y == -1], X[:, 1][Y == -1], c="r", marker="o", edgecolors="k")
plt.title("Original data")
plt.show()

plt.figure()
dim1 = 0
dim2 = 1
plt.scatter(
    X_norm[:, dim1][Y == 1], X_norm[:, dim2][Y == 1], c="b", marker="o", edgecolors="k"
)
plt.scatter(
    X_norm[:, dim1][Y == -1], X_norm[:, dim2][Y == -1], c="r", marker="o", edgecolors="k"
)
plt.title("Padded and normalised data (dims {} and {})".format(dim1, dim2))
plt.show()

plt.figure()
dim1 = 0
dim2 = 3
plt.scatter(
    features[:, dim1][Y == 1], features[:, dim2][Y == 1], c="b", marker="o", edgecolors="k"
)
plt.scatter(
    features[:, dim1][Y == -1], features[:, dim2][Y == -1], c="r", marker="o", edgecolors="k"
)
plt.title("Feature vectors (dims {} and {})".format(dim1, dim2))
plt.show()
  • This time we want to generalize from the data samples. To monitor the generalization performance, the data is split into a training and a validation set.
In [77]:
np.random.seed(0)
num_data = len(Y)
num_train = int(0.75 * num_data)
index = np.random.permutation(range(num_data))
feats_train = features[index[:num_train]]
Y_train = Y[index[:num_train]]
feats_val = features[index[num_train:]]
Y_val = Y[index[num_train:]]

# We need these later for plotting
X_train = X[index[:num_train]]
X_val = X[index[num_train:]]

4.2.2.4 Optimization¶

  • First we initialize the variables.
In [78]:
num_qubits = 2
num_layers = 6

weights_init = 0.01 * np.random.randn(num_layers, num_qubits, 3, requires_grad=True)
bias_init = np.array(0.0, requires_grad=True)
  • Again we optimize the cost. This may take a little patience.
In [79]:
opt = NesterovMomentumOptimizer(0.01)
batch_size = 50

# train the variational classifier
weights = weights_init
bias = bias_init
Iter_num = 25

for it in range(Iter_num):

    # Update the weights by one optimizer step
    batch_index = np.random.randint(0, num_train, (batch_size,))
    feats_train_batch = feats_train[batch_index]
    Y_train_batch = Y_train[batch_index]
    weights, bias, _, _ = opt.step(cost, weights, bias, feats_train_batch, Y_train_batch)

    # Compute predictions on train and validation set
    predictions_train = [np.sign(variational_classifier(weights, bias, f)) for f in feats_train]
    predictions_val = [np.sign(variational_classifier(weights, bias, f)) for f in feats_val]

    # Compute accuracy on train and validation set
    acc_train = accuracy(Y_train, predictions_train)
    acc_val = accuracy(Y_val, predictions_val)

    print(
        "Iter: {:5d} | Cost: {:0.7f} | Acc train: {:0.7f} | Acc validation: {:0.7f} "
        "".format(it + 1, cost(weights, bias, features, Y), acc_train, acc_val)
    )
Iter:     1 | Cost: 1.4647420 | Acc train: 0.4933333 | Acc validation: 0.5600000 
Iter:     2 | Cost: 1.3817093 | Acc train: 0.4933333 | Acc validation: 0.5600000 
Iter:     3 | Cost: 1.2571520 | Acc train: 0.4800000 | Acc validation: 0.5600000 
Iter:     4 | Cost: 1.1050173 | Acc train: 0.4533333 | Acc validation: 0.5600000 
Iter:     5 | Cost: 0.9704723 | Acc train: 0.4933333 | Acc validation: 0.6000000 
Iter:     6 | Cost: 0.8841282 | Acc train: 0.5866667 | Acc validation: 0.7600000 
Iter:     7 | Cost: 0.8459409 | Acc train: 0.7200000 | Acc validation: 0.7200000 
Iter:     8 | Cost: 0.8408246 | Acc train: 0.7333333 | Acc validation: 0.6800000 
Iter:     9 | Cost: 0.8453070 | Acc train: 0.6400000 | Acc validation: 0.6400000 
Iter:    10 | Cost: 0.8410790 | Acc train: 0.6266667 | Acc validation: 0.6400000 
Iter:    11 | Cost: 0.8081787 | Acc train: 0.6400000 | Acc validation: 0.6400000 
Iter:    12 | Cost: 0.7677528 | Acc train: 0.7200000 | Acc validation: 0.6800000 
Iter:    13 | Cost: 0.7381111 | Acc train: 0.8266667 | Acc validation: 0.7200000 
Iter:    14 | Cost: 0.7194719 | Acc train: 0.8800000 | Acc validation: 1.0000000 
Iter:    15 | Cost: 0.7206167 | Acc train: 0.8133333 | Acc validation: 0.9200000 
Iter:    16 | Cost: 0.7321430 | Acc train: 0.7866667 | Acc validation: 0.8000000 
Iter:    17 | Cost: 0.7440584 | Acc train: 0.7733333 | Acc validation: 0.7600000 
Iter:    18 | Cost: 0.7489444 | Acc train: 0.7600000 | Acc validation: 0.7600000 
Iter:    19 | Cost: 0.7472663 | Acc train: 0.7733333 | Acc validation: 0.7600000 
Iter:    20 | Cost: 0.7414264 | Acc train: 0.7866667 | Acc validation: 0.7600000 
Iter:    21 | Cost: 0.7340591 | Acc train: 0.8266667 | Acc validation: 0.9200000 
Iter:    22 | Cost: 0.7235194 | Acc train: 0.8266667 | Acc validation: 0.9200000 
Iter:    23 | Cost: 0.7159816 | Acc train: 0.8933333 | Acc validation: 0.9600000 
Iter:    24 | Cost: 0.7110693 | Acc train: 0.8933333 | Acc validation: 1.0000000 
Iter:    25 | Cost: 0.7064306 | Acc train: 0.8933333 | Acc validation: 1.0000000 

4.2.2.5 Result¶

  • We can plot the continuous output of the variational classifier for the first two dimensions of the Iris data set.
In [80]:
plt.figure(figsize=(12,9))
cm = plt.cm.RdBu

# make data for decision regions
xx, yy = np.meshgrid(np.linspace(0.0, 1.5, 20), np.linspace(0.0, 1.5, 20))
X_grid = [np.array([x, y]) for x, y in zip(xx.flatten(), yy.flatten())]

# preprocess grid points like data inputs above
padding = 0.3 * np.ones((len(X_grid), 1))
X_grid = np.c_[np.c_[X_grid, padding], np.zeros((len(X_grid), 1))]  # pad each input
normalization = np.sqrt(np.sum(X_grid ** 2, -1))
X_grid = (X_grid.T / normalization).T  # normalize each input
features_grid = np.array(
    [get_angles(x) for x in X_grid]
)  # angles for state preparation are new features
predictions_grid = [variational_classifier(weights, bias, f) for f in features_grid]
Z = np.reshape(predictions_grid, xx.shape)

# plot decision regions
cnt = plt.contourf(
    xx, yy, Z, levels=np.arange(-1, 1.1, 0.1), cmap=cm, alpha=0.8, extend="both"
)
plt.contour(
    xx, yy, Z, levels=[0.0], colors=("black",), linestyles=("--",), linewidths=(0.8,)
)
plt.colorbar(cnt, ticks=[-1, 0, 1])

# plot data
plt.scatter(
    X_train[:, 0][Y_train == 1],
    X_train[:, 1][Y_train == 1],
    c="b",
    marker="o",
    edgecolors="k",
    label="class 1 train",
)
plt.scatter(
    X_val[:, 0][Y_val == 1],
    X_val[:, 1][Y_val == 1],
    c="b",
    marker="^",
    edgecolors="k",
    label="class 1 validation",
)
plt.scatter(
    X_train[:, 0][Y_train == -1],
    X_train[:, 1][Y_train == -1],
    c="r",
    marker="o",
    edgecolors="k",
    label="class -1 train",
)
plt.scatter(
    X_val[:, 0][Y_val == -1],
    X_val[:, 1][Y_val == -1],
    c="r",
    marker="^",
    edgecolors="k",
    label="class -1 validation",
)

plt.legend()
plt.show()
  • When more iterations are performed, say 60 iterations, we obtain a better-optimized model with better predictions.

variational_classifier_iris_iter60_result_plot.PNG


4.3 Quantum Convolutional Neural Network (QCNN) using PennyLane¶

  • Here, we present a simple model of a quantum convolutional neural network (QCNN) and compare it with a classical convolutional neural network (CNN).

  • Link: PennyLane - Tutorial on QCNN

4.3.1 Introduction: Classical convolution vs. Quantum convolution¶

Classical convolutional neural network

CNN_process.png

  • The convolutional neural network (CNN, 합성곱 신경망 모형) is a standard model in classical machine learning which is particularly suitable for processing images.
  • The model is based on the idea of a convolution layer where, instead of processing the full input data with a global function, a local convolution is applied.
  • If the input is an image, small local regions are sequentially processed with the same kernel. The results obtained for each region are usually associated with different channels of a single output pixel. The union of all the output pixels produces a new image-like object, which can be further processed by additional layers.

  • In summary, there are mainly three parts: the convolutional layer, pooling, and classification.

    • convolutional layer: In a nutshell, convolution (합성곱) is a mathematical operation that takes many data values and combines them into new values: "highlighting the valuable part of the information."
    • pooling: "Extracting valuable information from the data, and then resizing it to a smaller size."
    • classification: At the end, we apply a neural network model (신경망 모형) to classify the 'filtered' data into the given classes.
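The three parts above can be sketched with a few lines of Keras. The layer sizes (4 filters, a 2x2 kernel, 10 classes) are illustrative choices for this sketch, not values fixed by the text:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Convolution -> pooling -> classification: the three parts listed above.
# Layer sizes here are illustrative choices, not fixed by the text.
cnn = keras.models.Sequential([
    keras.Input(shape=(28, 28, 1)),
    keras.layers.Conv2D(4, kernel_size=(2, 2), activation="relu"),  # convolutional layer
    keras.layers.MaxPooling2D(pool_size=(2, 2)),                    # pooling
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),                   # classification
])
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Forward pass on a dummy image: one vector of 10 class probabilities.
dummy = np.random.rand(1, 28, 28, 1).astype("float32")
probs = cnn(dummy).numpy()
```

Each 2x2 kernel application fills one channel of one output pixel, which is exactly the structure the quantum version below imitates.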

Quantum convolutional neural network

QCNN_process.png

  • Just like the classical convolutional neural network, we can apply a similar model architecture to a quantum circuit. A simple scheme is represented in the figure at the top.

    • 1) A small region of the input image, in our example a $2\times2$ square, is embedded into a quantum circuit. In this demo, this is achieved with parametrized rotations applied to the qubits initialized in the ground state. This step corresponds to the convolutional layer in a classical CNN.
    • For simplicity of the quantum circuit, there is no pooling step in this process.
    • 2) A quantum computation, associated with a unitary $U$, is performed on the system. The unitary could be generated by a variational quantum circuit or, more simply, by a random circuit.
    • 3) The quantum system is finally measured, obtaining a list of classical expectation values. The measurement results could also be classically post-processed. But, for simplicity, in this demo we directly use the raw expectation values.
    • 4) Analogously to a classical convolution layer, each expectation value is mapped to a different channel of a single output pixel.
    • From 2) to 4), these steps correspond to the classification step in a classical CNN.
    • 5) Iterating the same procedure over different regions, one can scan the full input image, producing an output object which will be structured as a multi-channel image.
    • 6) The quantum convolution can be followed by further quantum layers or by classical layers.
  • The main difference with respect to a classical convolution is that a quantum circuit can generate highly complex kernels whose computation could be, at least in principle, classically intractable.

4.3.2 What is happening to qubits when we are optimizing a quantum circuit?¶

training_qubit_target_x.gif

  • Qubit / Bloch sphere: A qubit can be represented as a point on the Bloch sphere.
  • State of a qubit: The state of a qubit is shown as the arrow in the Bloch sphere.
  • Moving state: When a quantum gate is applied to a qubit, the state moves as shown in the animation above.
  • X mark: The targeted 'location' on the Bloch sphere.

    $\Rightarrow$ We are trying to find 'the best mapping quantum gate' which finds 'the targeted X-mark'.

4.3.3 Loading packages / data:¶

  • Here, we will construct a QCNN model using the PennyLane package and the MNIST dataset, which consists of images of handwritten single digits: 0, 1, 2, 3, ..., 8, 9.
  • This Python code requires PennyLane with the TensorFlow interface and the plotting library matplotlib.

Install / loading packages

In [81]:
# Install packages
# !pip install tensorflow
In [82]:
# Loading packages
import pennylane as qml
from pennylane import numpy as np
from pennylane.templates import RandomLayers
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt

4.3.4 Setting of the main hyper-parameter of the model¶

  • Hyper-parameters determine the structure of the model, the number of observations in the training/test datasets, and some values used in optimizing the model.
In [83]:
n_epochs = 30   # Number of optimization epochs
n_layers = 1    # Number of random layers
n_train = 50    # Size of the train dataset
n_test = 30     # Size of the test dataset

# SAVE_PATH = "quanvolution/" # Data saving folder. Here we don't need to make saving path.
PREPROCESS = True           # If False, skip quantum processing and load data from SAVE_PATH
np.random.seed(0)           # Seed for NumPy random number generator
tf.random.set_seed(0)       # Seed for TensorFlow random number generator

4.3.5 Loading MNIST dataset from tensorflow-keras¶

MNIST_samples.png

  • Here, we import the MNIST dataset from the Keras package.
  • To speed up the algorithm, we will only use a small number of training and test images.
  • Usually, a better-optimized model is achieved when we use the full dataset.
In [84]:
# Assigning MNIST data into train / test data
mnist_dataset = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist_dataset.load_data()

# Reduce dataset size
## Previously, we set the number of train / test data in setting the hyper-parameters.
train_images = train_images[:n_train]
train_labels = train_labels[:n_train]
test_images = test_images[:n_test]
test_labels = test_labels[:n_test]

# Normalize pixel values within 0 and 1.
## We are modifying the data values between 0 and 1.
train_images = train_images / 255
test_images = test_images / 255

# Add extra dimension for convolution channels
train_images = np.array(train_images[..., tf.newaxis], requires_grad=False)
test_images = np.array(test_images[..., tf.newaxis], requires_grad=False)

4.3.6 Quantum circuit as a convolution kernel¶

  • Here, we will be building a QCNN circuit.

  • We follow the scheme described in the introduction and represented in the figure at the top.

  • We initialize a PennyLane default.qubit device, simulating a system of $4$ qubits. The associated qnode represents the quantum circuit consisting of:
      1. an embedding layer of local $R_y$ rotations (with angles scaled by a factor of $\pi$)
      2. a random circuit of n_layers layers, which we set earlier in the hyper-parameter step
      3. a final measurement in the computational basis, estimating $4$ expectation values.
In [85]:
# Setup the device using qml.device()
dev = qml.device("default.qubit", wires=4)
# Random circuit parameters using in rotation-Y gate
rand_params = np.random.uniform(high=2 * np.pi, size=(n_layers, 4))

@qml.qnode(dev)
def circuit(phi):
    # Encoding of 4 classical input values
    for j in range(4):
        qml.RY(np.pi * phi[j], wires=j)

    # Random quantum circuit
    RandomLayers(rand_params, wires=list(range(4)))

    # Measurement producing 4 classical output values
    return [qml.expval(qml.PauliZ(j)) for j in range(4)]

## Assigning phi_value to plot the circuit
phi_value = [0,0,0,0]

## Plotting the quantum circuit
fig, ax = qml.draw_mpl(circuit)(phi_value)
plt.show()
  • The next function defines the convolution scheme:
      1. the image is divided into squares of $2\times2$ pixels
      2. each square is processed by the quantum circuit
      3. the $4$ expectation values are mapped into $4$ different channels of a single output pixel.
In [86]:
def quanv(image):
    """Convolves the input image with many applications of the same quantum circuit."""
    out = np.zeros((14, 14, 4))

    # Loop over the coordinates of the top-left pixel of 2X2 squares
    for j in range(0, 28, 2):
        for k in range(0, 28, 2):
            # Process a squared 2x2 region of the image with a quantum circuit
            q_results = circuit(
                [
                    image[j, k, 0],
                    image[j, k + 1, 0],
                    image[j + 1, k, 0],
                    image[j + 1, k + 1, 0]
                ]
            )
            # Assign expectation values to different channels of the output pixel (j/2, k/2)
            for c in range(4):
                out[j // 2, k // 2, c] = q_results[c]
    return out

4.3.7 Quantum pre-processing of the dataset¶

  • To obtain the results of the quantum circuit faster, instead of re-running the circuit at every training step, we apply it once as a "pre-processing" layer to all the image data.
  • Later, an entirely classical model will be trained and tested on the pre-processed dataset.
In [87]:
if PREPROCESS == True:
    q_train_images = []
    print("Quantum pre-processing of train images:")
    for idx, img in enumerate(train_images):
        print("{}/{}        ".format(idx + 1, n_train), end="\r")
        q_train_images.append(quanv(img))
    q_train_images = np.asarray(q_train_images)

    q_test_images = []
    print("\nQuantum pre-processing of test images:")
    for idx, img in enumerate(test_images):
        print("{}/{}        ".format(idx + 1, n_test), end="\r")
        q_test_images.append(quanv(img))
    q_test_images = np.asarray(q_test_images)

#     # Save pre-processed images
#     np.save(SAVE_PATH + "q_train_images.npy", q_train_images)
#     np.save(SAVE_PATH + "q_test_images.npy", q_test_images)


# # Load pre-processed images
# q_train_images = np.load(SAVE_PATH + "q_train_images.npy")
# q_test_images = np.load(SAVE_PATH + "q_test_images.npy")
Quantum pre-processing of train images:
50/50        
Quantum pre-processing of test images:
30/30        

4.3.8 Plotting the effect of the QCNN on the data¶

  • We visualize the effect of the quantum convolution layer on a batch of samples.
  • On a real quantum computer, this intermediate effect is not easy to see, because the quantum state collapses when we 'observe' (measure) the data being processed.
In [88]:
n_samples = 4
n_channels = 4
fig, axes = plt.subplots(1 + n_channels, n_samples, figsize=(10, 10))
for k in range(n_samples):
    axes[0, 0].set_ylabel("Input")
    if k != 0:
        axes[0, k].yaxis.set_visible(False)
    axes[0, k].imshow(train_images[k, :, :, 0], cmap="gray")

    # Plot all output channels
    for c in range(n_channels):
        axes[c + 1, 0].set_ylabel("Output [ch. {}]".format(c))
        if k != 0:
            axes[c + 1, k].yaxis.set_visible(False)  # hide y-axis of output row c (row c+1 of the grid)
        axes[c + 1, k].imshow(q_train_images[k, :, :, c], cmap="gray")

plt.tight_layout()
plt.show()
  • Input: the initial input data.

  • Output: for each input, the $4$ output channels generated by the quantum convolution are visualized in gray scale. Since we measured $4$ wires (qubits), there are $4$ outputs.

  • One can clearly notice the down-sampling of the resolution and some local distortion introduced by the quantum kernel.

  • On the other hand, the global shape of the image is preserved, as expected for a convolution layer.

4.3.9 Hybrid quantum-classical model¶

  • Assuming that we have applied the quantum convolutional layer, we feed the resulting features into a classical neural network that will be trained to classify the $10$ different digits of the MNIST dataset.

  • We use a very simple model: just a fully connected layer with $10$ output nodes and a final softmax activation function.

    • softmax activation function: it maps $k$ real numbers to $k$ values in $\big(0, 1\big)$ that sum to $1$, interpreted as 'the probability of being in each class'.
  • The model is compiled with the Adam optimizer (a variant of stochastic gradient descent, matching the code below) and a cross-entropy loss function.

    • cross-entropy loss function: it measures the difference between the 'true distribution' (obtained from the true labels of the dataset) and the 'estimated distribution' (obtained from the optimized (or predicted) model).
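Both definitions can be checked numerically; the logits and label below are made-up values for illustration:

```python
import numpy as np

# Softmax: k real numbers -> k probabilities in (0, 1) summing to 1.
def softmax(z):
    e = np.exp(z - np.max(z))       # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # made-up raw network outputs
probs = softmax(logits)

# Sparse categorical cross-entropy for one sample: -log of the probability
# assigned to the true class; small when the model is confident and correct.
true_class = 0
loss = -np.log(probs[true_class])
```

The loss shrinks toward zero as the probability of the true class approaches one, which is what the optimizer below drives it to do.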
In [89]:
def MyModel():
    """Initializes and returns a custom Keras model
    which is ready to be trained."""
    model = keras.models.Sequential([
        keras.layers.Flatten(),
        keras.layers.Dense(10, activation="softmax")
    ])

    model.compile(
        optimizer='adam',
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

4.3.10 Training the model¶

  • We first initialize an instance of the model, then we train and validate it with the dataset that has been already pre-processed by a quantum convolution.
In [90]:
q_model = MyModel()

q_history = q_model.fit(
    q_train_images,
    train_labels,
    validation_data=(q_test_images, test_labels),
    batch_size=4,
    epochs=n_epochs,
    verbose=2,
)
Epoch 1/30
13/13 - 0s - loss: 2.9656 - accuracy: 0.1800 - val_loss: 2.5706 - val_accuracy: 0.0333 - 407ms/epoch - 31ms/step
Epoch 2/30
13/13 - 0s - loss: 2.2246 - accuracy: 0.2200 - val_loss: 2.3467 - val_accuracy: 0.2333 - 25ms/epoch - 2ms/step
Epoch 3/30
13/13 - 0s - loss: 1.7292 - accuracy: 0.5000 - val_loss: 2.2192 - val_accuracy: 0.2333 - 24ms/epoch - 2ms/step
Epoch 4/30
13/13 - 0s - loss: 1.3182 - accuracy: 0.6000 - val_loss: 1.9675 - val_accuracy: 0.3333 - 24ms/epoch - 2ms/step
Epoch 5/30
13/13 - 0s - loss: 1.1159 - accuracy: 0.7000 - val_loss: 1.8490 - val_accuracy: 0.5000 - 24ms/epoch - 2ms/step
Epoch 6/30
13/13 - 0s - loss: 0.9147 - accuracy: 0.8000 - val_loss: 1.8180 - val_accuracy: 0.4000 - 25ms/epoch - 2ms/step
Epoch 7/30
13/13 - 0s - loss: 0.7116 - accuracy: 0.9000 - val_loss: 1.6675 - val_accuracy: 0.5333 - 26ms/epoch - 2ms/step
Epoch 8/30
13/13 - 0s - loss: 0.5623 - accuracy: 1.0000 - val_loss: 1.5872 - val_accuracy: 0.5333 - 25ms/epoch - 2ms/step
Epoch 9/30
13/13 - 0s - loss: 0.4813 - accuracy: 0.9800 - val_loss: 1.5275 - val_accuracy: 0.5000 - 24ms/epoch - 2ms/step
Epoch 10/30
13/13 - 0s - loss: 0.3913 - accuracy: 1.0000 - val_loss: 1.5443 - val_accuracy: 0.5333 - 25ms/epoch - 2ms/step
Epoch 11/30
13/13 - 0s - loss: 0.3646 - accuracy: 0.9800 - val_loss: 1.4344 - val_accuracy: 0.5333 - 27ms/epoch - 2ms/step
Epoch 12/30
13/13 - 0s - loss: 0.3153 - accuracy: 1.0000 - val_loss: 1.5359 - val_accuracy: 0.5333 - 27ms/epoch - 2ms/step
Epoch 13/30
13/13 - 0s - loss: 0.2676 - accuracy: 1.0000 - val_loss: 1.3962 - val_accuracy: 0.5667 - 27ms/epoch - 2ms/step
Epoch 14/30
13/13 - 0s - loss: 0.2488 - accuracy: 1.0000 - val_loss: 1.3335 - val_accuracy: 0.5333 - 26ms/epoch - 2ms/step
Epoch 15/30
13/13 - 0s - loss: 0.2048 - accuracy: 1.0000 - val_loss: 1.3704 - val_accuracy: 0.5667 - 25ms/epoch - 2ms/step
Epoch 16/30
13/13 - 0s - loss: 0.1807 - accuracy: 1.0000 - val_loss: 1.3774 - val_accuracy: 0.6000 - 25ms/epoch - 2ms/step
Epoch 17/30
13/13 - 0s - loss: 0.1735 - accuracy: 1.0000 - val_loss: 1.3150 - val_accuracy: 0.5667 - 25ms/epoch - 2ms/step
Epoch 18/30
13/13 - 0s - loss: 0.1499 - accuracy: 1.0000 - val_loss: 1.3285 - val_accuracy: 0.6000 - 26ms/epoch - 2ms/step
Epoch 19/30
13/13 - 0s - loss: 0.1383 - accuracy: 1.0000 - val_loss: 1.3154 - val_accuracy: 0.6000 - 26ms/epoch - 2ms/step
Epoch 20/30
13/13 - 0s - loss: 0.1279 - accuracy: 1.0000 - val_loss: 1.2621 - val_accuracy: 0.5667 - 29ms/epoch - 2ms/step
Epoch 21/30
13/13 - 0s - loss: 0.1171 - accuracy: 1.0000 - val_loss: 1.3197 - val_accuracy: 0.5667 - 28ms/epoch - 2ms/step
Epoch 22/30
13/13 - 0s - loss: 0.1077 - accuracy: 1.0000 - val_loss: 1.2748 - val_accuracy: 0.6000 - 26ms/epoch - 2ms/step
Epoch 23/30
13/13 - 0s - loss: 0.1007 - accuracy: 1.0000 - val_loss: 1.2443 - val_accuracy: 0.6000 - 27ms/epoch - 2ms/step
Epoch 24/30
13/13 - 0s - loss: 0.0939 - accuracy: 1.0000 - val_loss: 1.2383 - val_accuracy: 0.6000 - 25ms/epoch - 2ms/step
Epoch 25/30
13/13 - 0s - loss: 0.0900 - accuracy: 1.0000 - val_loss: 1.2418 - val_accuracy: 0.6000 - 27ms/epoch - 2ms/step
Epoch 26/30
13/13 - 0s - loss: 0.0809 - accuracy: 1.0000 - val_loss: 1.2435 - val_accuracy: 0.6000 - 25ms/epoch - 2ms/step
Epoch 27/30
13/13 - 0s - loss: 0.0774 - accuracy: 1.0000 - val_loss: 1.2226 - val_accuracy: 0.6000 - 27ms/epoch - 2ms/step
Epoch 28/30
13/13 - 0s - loss: 0.0758 - accuracy: 1.0000 - val_loss: 1.2228 - val_accuracy: 0.6000 - 26ms/epoch - 2ms/step
Epoch 29/30
13/13 - 0s - loss: 0.0699 - accuracy: 1.0000 - val_loss: 1.2535 - val_accuracy: 0.6000 - 29ms/epoch - 2ms/step
Epoch 30/30
13/13 - 0s - loss: 0.0652 - accuracy: 1.0000 - val_loss: 1.2156 - val_accuracy: 0.6000 - 27ms/epoch - 2ms/step
  • In order to compare the results achievable with and without the quantum convolution layer, we initialize also a “classical” instance of the model that will be directly trained and validated with the raw MNIST images (i.e., without quantum pre-processing).
In [91]:
c_model = MyModel()

c_history = c_model.fit(
    train_images,
    train_labels,
    validation_data=(test_images, test_labels),
    batch_size=4,
    epochs=n_epochs,
    verbose=2,
)
Epoch 1/30
13/13 - 0s - loss: 2.2957 - accuracy: 0.2000 - val_loss: 2.0767 - val_accuracy: 0.2333 - 277ms/epoch - 21ms/step
Epoch 2/30
13/13 - 0s - loss: 1.9250 - accuracy: 0.5000 - val_loss: 1.9342 - val_accuracy: 0.3333 - 26ms/epoch - 2ms/step
Epoch 3/30
13/13 - 0s - loss: 1.6363 - accuracy: 0.6600 - val_loss: 1.8042 - val_accuracy: 0.4667 - 24ms/epoch - 2ms/step
Epoch 4/30
13/13 - 0s - loss: 1.4091 - accuracy: 0.7400 - val_loss: 1.6767 - val_accuracy: 0.5333 - 26ms/epoch - 2ms/step
Epoch 5/30
13/13 - 0s - loss: 1.2103 - accuracy: 0.8200 - val_loss: 1.5733 - val_accuracy: 0.6667 - 27ms/epoch - 2ms/step
Epoch 6/30
13/13 - 0s - loss: 1.0519 - accuracy: 0.8600 - val_loss: 1.4936 - val_accuracy: 0.6667 - 26ms/epoch - 2ms/step
Epoch 7/30
13/13 - 0s - loss: 0.9178 - accuracy: 0.8800 - val_loss: 1.4337 - val_accuracy: 0.7000 - 29ms/epoch - 2ms/step
Epoch 8/30
13/13 - 0s - loss: 0.8044 - accuracy: 0.9200 - val_loss: 1.3763 - val_accuracy: 0.6667 - 27ms/epoch - 2ms/step
Epoch 9/30
13/13 - 0s - loss: 0.7090 - accuracy: 0.9200 - val_loss: 1.3249 - val_accuracy: 0.7000 - 26ms/epoch - 2ms/step
Epoch 10/30
13/13 - 0s - loss: 0.6302 - accuracy: 0.9400 - val_loss: 1.2921 - val_accuracy: 0.7000 - 26ms/epoch - 2ms/step
Epoch 11/30
13/13 - 0s - loss: 0.5685 - accuracy: 0.9600 - val_loss: 1.2517 - val_accuracy: 0.7000 - 27ms/epoch - 2ms/step
Epoch 12/30
13/13 - 0s - loss: 0.5127 - accuracy: 1.0000 - val_loss: 1.2427 - val_accuracy: 0.7000 - 26ms/epoch - 2ms/step
Epoch 13/30
13/13 - 0s - loss: 0.4631 - accuracy: 0.9800 - val_loss: 1.2180 - val_accuracy: 0.7000 - 26ms/epoch - 2ms/step
Epoch 14/30
13/13 - 0s - loss: 0.4186 - accuracy: 1.0000 - val_loss: 1.1807 - val_accuracy: 0.7000 - 25ms/epoch - 2ms/step
Epoch 15/30
13/13 - 0s - loss: 0.3806 - accuracy: 1.0000 - val_loss: 1.1570 - val_accuracy: 0.7000 - 26ms/epoch - 2ms/step
Epoch 16/30
13/13 - 0s - loss: 0.3485 - accuracy: 1.0000 - val_loss: 1.1436 - val_accuracy: 0.7000 - 25ms/epoch - 2ms/step
Epoch 17/30
13/13 - 0s - loss: 0.3215 - accuracy: 1.0000 - val_loss: 1.1284 - val_accuracy: 0.7000 - 24ms/epoch - 2ms/step
Epoch 18/30
13/13 - 0s - loss: 0.2963 - accuracy: 1.0000 - val_loss: 1.1084 - val_accuracy: 0.7000 - 24ms/epoch - 2ms/step
Epoch 19/30
13/13 - 0s - loss: 0.2727 - accuracy: 1.0000 - val_loss: 1.1065 - val_accuracy: 0.7000 - 26ms/epoch - 2ms/step
Epoch 20/30
13/13 - 0s - loss: 0.2534 - accuracy: 1.0000 - val_loss: 1.0893 - val_accuracy: 0.7000 - 28ms/epoch - 2ms/step
Epoch 21/30
13/13 - 0s - loss: 0.2357 - accuracy: 1.0000 - val_loss: 1.0872 - val_accuracy: 0.7000 - 28ms/epoch - 2ms/step
Epoch 22/30
13/13 - 0s - loss: 0.2192 - accuracy: 1.0000 - val_loss: 1.0760 - val_accuracy: 0.7333 - 28ms/epoch - 2ms/step
Epoch 23/30
13/13 - 0s - loss: 0.2043 - accuracy: 1.0000 - val_loss: 1.0675 - val_accuracy: 0.7667 - 30ms/epoch - 2ms/step
Epoch 24/30
13/13 - 0s - loss: 0.1911 - accuracy: 1.0000 - val_loss: 1.0596 - val_accuracy: 0.7667 - 29ms/epoch - 2ms/step
Epoch 25/30
13/13 - 0s - loss: 0.1806 - accuracy: 1.0000 - val_loss: 1.0540 - val_accuracy: 0.7667 - 28ms/epoch - 2ms/step
Epoch 26/30
13/13 - 0s - loss: 0.1686 - accuracy: 1.0000 - val_loss: 1.0482 - val_accuracy: 0.7667 - 30ms/epoch - 2ms/step
Epoch 27/30
13/13 - 0s - loss: 0.1594 - accuracy: 1.0000 - val_loss: 1.0403 - val_accuracy: 0.7667 - 29ms/epoch - 2ms/step
Epoch 28/30
13/13 - 0s - loss: 0.1513 - accuracy: 1.0000 - val_loss: 1.0340 - val_accuracy: 0.7667 - 27ms/epoch - 2ms/step
Epoch 29/30
13/13 - 0s - loss: 0.1424 - accuracy: 1.0000 - val_loss: 1.0339 - val_accuracy: 0.7667 - 30ms/epoch - 2ms/step
Epoch 30/30
13/13 - 0s - loss: 0.1343 - accuracy: 1.0000 - val_loss: 1.0294 - val_accuracy: 0.7667 - 29ms/epoch - 2ms/step

4.3.11 Prediction and model evaluation¶

In [92]:
import matplotlib.pyplot as plt

plt.style.use("seaborn")
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(6, 9))

ax1.plot(q_history.history["val_accuracy"], "-ob", label="With quantum layer")
ax1.plot(c_history.history["val_accuracy"], "-og", label="Without quantum layer")
ax1.set_ylabel("Accuracy")
ax1.set_ylim([0, 1])
ax1.set_xlabel("Epoch")
ax1.legend()

ax2.plot(q_history.history["val_loss"], "-ob", label="With quantum layer")
ax2.plot(c_history.history["val_loss"], "-og", label="Without quantum layer")
ax2.set_ylabel("Loss")
ax2.set_ylim(top=2.5)
ax2.set_xlabel("Epoch")
ax2.legend()
plt.tight_layout()
plt.show()

4.4 QCNN in development / research level¶

  • We will look at some work done by Professor Daniel K. Park.
    • "Quantum convolutional neural network for classical data (Quantum Machine Intelligence (2022))" by Tak Hur, Leeseok Kim, Daniel K. Park

4.4.1 QCNN models scheme¶

  • Similar to our previous exercise, the quantum circuit shown below is a Quantum convolutional neural network (QCNN) model designed to classify 0s and 1s from the MNIST dataset.

QCNN-optimizing.PNG

  • Here is an example of the building blocks of the QCNN circuits.
    • It is more complicated than the previous quantum circuit.
    • Data Encoding: uploading classical data into a quantum state.
      • e.g., amplitude encoding, qubit encoding, dense encoding, etc.
    • Convolution: a convolutional circuit applied to neighboring qubits.
    • Pooling: extracting the 'valuable' features after the 'convolution' step.
    • Measurement: measuring the last qubit to classify the data as 0 or 1.

4.4.2 Data Encoding: classical data into quantum data¶

  • Here is a list of the data-encoding methods that could be applied.
    • Amplitude encoding
    • Qubit encoding
    • Dense qubit encoding
    • Hybrid qubit encoding (hybrid direct encoding and hybrid angle encoding)

QCNN_data_encoding.png

4.4.3 Convolutional filter and Pooling layer¶

  • Here is a list of the convolutional filters that could be applied.

QCNN_convolutional_circuit.PNG

  • Here is the pooling layer being applied.

QCNN_pooling_layer.PNG

4.4.4 Results of QCNN model¶

  • One example of a QCNN circuit structure is shown below.

QCNN_MNIST_0-1_amplitude_enc_Convol1_circuit.PNG

  • After running the quantum circuit on a quantum simulator and optimizing the model through a classical process, here are some of the results for classifying 0 and 1.

QCNN_MNIST_0-1_cross_entropy_result.PNG

  • Similar to the high accuracy of the classical CNN model on the MNIST dataset, the QCNN model also performs with high accuracy.

QCNN_CNN_MNIST_0-1_model_comparison.png

  • One advantage shown by the QCNN model compared to the classical CNN model is that the optimized QCNN model is obtained faster than the classical CNN. Notice how quickly the loss values (cross-entropy loss) decrease as the number of iterations increases.